
Agent Genie++: Add README and deployment images#651

Open
nitinaggarwal-databricks wants to merge 9 commits into databrickslabs:main from nitinaggarwal-databricks:agent-genie++

Conversation

@nitinaggarwal-databricks
Contributor

Summary

Adds documentation and images for the Agent Genie++ app to the agent_genie directory.

Changes

  • README.md – Documentation covering:
    • Overview and key features
    • Architecture and analytics capabilities
    • Deployment instructions (Marketplace and manual)
    • Troubleshooting (OAuth scopes, permissions)
    • User permissions and monitoring
  • images/ – Screenshots for the deployment workflow and app features

Testing

  • Verified README renders correctly
  • Image paths reference ./images/media/ correctly


Copilot AI left a comment


Pull request overview

This PR introduces the Agent Genie++ Databricks App assets under agent_genie/ (backend, UI, deployment manifests, and documentation), and also removes the legacy dbsql/serverless_migration_dash documentation.

Changes:

  • Add Agent Genie++ FastAPI backend and web UI (templates) to support Genie + Serving Endpoint workflows, plus PDF upload/Q&A.
  • Add deployment/config artifacts (databricks.yml, app.yaml, manifest.yaml) and Python dependencies (requirements.txt).
  • Add extensive agent_genie/README.md with screenshots; remove dbsql/serverless_migration_dash/README.md (directory appears removed).

Reviewed changes

Copilot reviewed 18 out of 34 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
manifest.mf Adds a root manifest file.
dbsql/serverless_migration_dash/README.md Removes the Serverless Migration Dash README (and likely the directory).
agent_genie/README.md Adds end-user/deployment documentation and screenshots.
agent_genie/app.py Adds the FastAPI app implementing Genie queries, AI-function routing, and PDF session handling.
agent_genie/templates/index.html Adds the UI (chat, table rendering, PDF upload, feedback capture).
agent_genie/helper.py Adds Genie polling + OAuth token minting and optional Tavily helper.
agent_genie/table_extraction.py Adds UC table/column introspection helpers via REST calls.
agent_genie/my_prompts.py Adds prompt templates for column selection, query rephrasing, and classification.
agent_genie/manual_ai_content.py Adds reference content used to construct AI-function prompts.
agent_genie/requirements.txt Adds Python dependencies for the app.
agent_genie/databricks.yml Adds Databricks bundle configuration for app + resources + installer job.
agent_genie/app.yaml Adds app runtime command/env bindings for Apps deployment.
agent_genie/manifest.yaml Adds app resource specs and user API scopes.
Comments suppressed due to low confidence (6)

agent_genie/templates/index.html:2574

  • uploadPdf() throws new Error(data.detail || 'Upload failed'), but the /upload_pdf backend returns errors under the error key (and only uses detail for FastAPI-raised HTTPExceptions). This drops useful error messages for most failure cases (encrypted/invalid PDFs, etc.). Please align the frontend to read data.error as well, or standardize backend error responses to always use detail.
                try {
                    const res = await fetch('/upload_pdf', { method: 'POST', body: form });
                    const data = await res.json();
                    if (!res.ok) throw new Error(data.detail || 'Upload failed');
                    
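As an illustration of the interim workaround the comment describes (sketched in Python rather than the page's JavaScript; `normalize_error` is a hypothetical helper, not part of the app), a caller can accept either error shape until the backend is standardized:

```python
def normalize_error(payload: dict, default: str = "Upload failed") -> str:
    """Pick a human-readable message from either error shape.

    FastAPI's HTTPException reports failures under "detail", while the
    /upload_pdf handler described above reports them under "error";
    checking both keys avoids dropping the more specific message.
    """
    return payload.get("detail") or payload.get("error") or default
```

The cleaner long-term fix, as the comment notes, is to standardize the backend on one key so no such bridging is needed.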

agent_genie/table_extraction.py:56

  • Both requests.get(...) calls are made without a timeout. In production this can hang worker threads indefinitely if the Databricks control plane or network stalls. Please set a reasonable timeout (and consider retry/backoff for transient errors) for these UC API calls.
        # Make the request
        response = requests.get(endpoint, headers=headers, params=params)
        
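A minimal sketch of the suggested fix (the `uc_get` helper name, retry count, and retryable status codes are illustrative assumptions, not the app's actual behavior); the HTTP session is injectable so the logic can be exercised without a network:

```python
import time

import requests


def uc_get(endpoint, headers=None, params=None, timeout=10.0, retries=3, session=None):
    """GET with an explicit timeout plus simple exponential backoff.

    `session` defaults to the `requests` module and is injectable for tests.
    Retries on connection errors and on 429/5xx responses; any other
    response is returned to the caller as-is.
    """
    http = session if session is not None else requests
    delay = 1.0
    for attempt in range(retries):
        try:
            resp = http.get(endpoint, headers=headers, params=params, timeout=timeout)
        except requests.exceptions.RequestException:
            if attempt == retries - 1:
                raise
        else:
            if resp.status_code not in (429, 500, 502, 503, 504) or attempt == retries - 1:
                return resp
        time.sleep(delay)
        delay *= 2
```

The same wrapper would cover the second finding below (table_extraction.py:110), since both calls share the missing-timeout issue.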

agent_genie/table_extraction.py:110

  • get_table_columns() uses requests.get(...) without a timeout, so a stalled request can hang the app. Please add a timeout (and ideally handle common HTTP errors/429s with retries/backoff).
        # Make the request
        response = requests.get(endpoint, headers=headers)
        

agent_genie/app.py:5

  • The PR description says this change only adds documentation and images, but the diff introduces a full FastAPI app (app.py/templates), dependency set (requirements.txt), deployment configs (databricks.yml/app.yaml/manifest.yaml), and also removes the dbsql/serverless_migration_dash README/dir. Please update the PR description/title to match the actual scope, or split this into separate PRs (docs/images vs application code vs dbsql cleanup) to make review/rollback safer.
from fastapi import FastAPI, Request, UploadFile, File, HTTPException
from fastapi.responses import JSONResponse
from fastapi.templating import Jinja2Templates
import pandas as pd
import logging

agent_genie/requirements.txt:18

  • openai is declared twice in this requirements file (once earlier and again here), which can confuse dependency management and makes intended constraints unclear. Please deduplicate the entry (and consider adding consistent version constraints).
python-dotenv
openai
backoff

agent_genie/templates/index.html:1352

  • The UI hard-codes an individual email address as the support contact. This is brittle for reuse/redistribution and can create privacy/ownership issues. Consider replacing it with a generic support alias or injecting it via configuration/env so it can be set per deployment.
        <div class="section-description">
            A multi-agent AI engine that answers questions, forecasts trends, classifies data, summarizes records, translates languages, and recommends next-best actions. For any questions or feedback, please reach out to nitin.aggarwal@databricks.com 
        </div>


Comment on lines +74 to +95

# --- Installer job to persist Tavily key into a secret and write optional config ---
jobs:
  install_app:
    name: "${var.project_name} - Install/Configure"
    tasks:
      - task_key: configure_app
        notebook_task:
          notebook_path: "./notebooks/setup_app" # create this notebook
          base_parameters:
            TAVILY_API_KEY: "${var.tavily_api_key}"
            SECRET_SCOPE: "${var.secret_scope}"
            SECRET_KEY: "${var.secret_key}"
        # Add compute for your workspace (example placeholders):
        # existing_cluster_id: "<your-cluster-id>"
        # OR:
        # job_clusters:
        #   - job_cluster_key: "install_cluster"
        #     new_cluster:
        #       spark_version: "14.3.x-scala2.12"
        #       node_type_id: "i3.xlarge"
        #       num_workers: 0

Copilot AI Mar 5, 2026


This installer job task points at ./notebooks/setup_app, but that notebook path is not present in the repo/PR. Bundle deployment will fail when it tries to create/run this job; please add the notebook or remove/disable the installer job block until implemented.

Suggested change
# --- Installer job to persist Tavily key into a secret and write optional config ---
jobs:
  install_app:
    name: "${var.project_name} - Install/Configure"
    tasks:
      - task_key: configure_app
        notebook_task:
          notebook_path: "./notebooks/setup_app" # create this notebook
          base_parameters:
            TAVILY_API_KEY: "${var.tavily_api_key}"
            SECRET_SCOPE: "${var.secret_scope}"
            SECRET_KEY: "${var.secret_key}"
        # Add compute for your workspace (example placeholders):
        # existing_cluster_id: "<your-cluster-id>"
        # OR:
        # job_clusters:
        #   - job_cluster_key: "install_cluster"
        #     new_cluster:
        #       spark_version: "14.3.x-scala2.12"
        #       node_type_id: "i3.xlarge"
        #       num_workers: 0

Copilot uses AI. Check for mistakes.
try {
    // Try to parse and pretty-print JSON
    const parsed = JSON.parse(content);
    formatted = '<pre>' + JSON.stringify(parsed, null, 2) + '</pre>';

Copilot AI Mar 5, 2026


Potential XSS: when JSON parsing succeeds, formatted is replaced with <pre> + JSON.stringify(...) without escaping, and the caller later assigns it via innerHTML. If JSON values contain HTML, it can execute in the page. Please escape the JSON string (or render via textContent on a created <pre> element) before inserting into the DOM.

Suggested change
formatted = '<pre>' + JSON.stringify(parsed, null, 2) + '</pre>';
const jsonString = JSON.stringify(parsed, null, 2);
const escapedJson = escapeHtml(jsonString);
formatted = '<pre>' + escapedJson + '</pre>';

Comment on lines +977 to +979
try:
    tables_result = get_tables(catalog_name, schema_name, fallback_workspace_url, None)
    if tables_result["success"]:

Copilot AI Mar 5, 2026


get_tables(...) is called with access_token=None, which forces table_extraction.py to fall back to ACCESS_TOKEN from the environment. In Databricks Apps/OBO mode you already mint an OAuth token via Config() (fallback_access_token), so schema discovery can fail even when auth is available. Please pass the minted token into this call (or refactor table_extraction to use the SDK).

for table in tables_result["tables"]:
    table_name = table["name"]
    try:
        columns_result = get_table_columns(catalog_name, schema_name, table_name, fallback_workspace_url, None)

Copilot AI Mar 5, 2026


get_table_columns(...) is called with access_token=None, which again falls back to ACCESS_TOKEN from env. To make this work reliably in OBO environments, pass fallback_access_token (or a freshly minted token) into this call as well.

Suggested change
columns_result = get_table_columns(catalog_name, schema_name, table_name, fallback_workspace_url, None)
columns_result = get_table_columns(catalog_name, schema_name, table_name, fallback_workspace_url, fallback_access_token)
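A sketch of the token-threading fix both comments ask for. The `discover_schema` wrapper is hypothetical, and the two helpers are injected as parameters here so the flow can be shown without real REST calls; in the app, `access_token` would be the minted `fallback_access_token` rather than None:

```python
def discover_schema(catalog_name, schema_name, workspace_url, access_token,
                    get_tables, get_table_columns):
    """Walk a schema's tables, reusing one minted OAuth token for every call.

    `get_tables` / `get_table_columns` stand in for the helpers in
    table_extraction.py; the point is that `access_token` is passed through
    instead of None, so no call silently falls back to ACCESS_TOKEN from env.
    """
    schema = {}
    tables_result = get_tables(catalog_name, schema_name, workspace_url, access_token)
    if not tables_result.get("success"):
        return schema
    for table in tables_result["tables"]:
        columns_result = get_table_columns(
            catalog_name, schema_name, table["name"], workspace_url, access_token
        )
        if columns_result.get("success"):
            schema[table["name"]] = columns_result["columns"]
    return schema
```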

Contributor

@pohlposition pohlposition left a comment


LGTM

Collaborator

@alexott alexott left a comment


Why does it remove files in other directories, like dbsql/serverless_migration_dash?

manifest (1).mf Outdated
Collaborator


Why do we need a copy of the file? Which of them is correct?

Contributor Author


I am only working on the 'agent_genie' directory, so I am not sure why it removed dbsql/serverless_migration_dash. The manifest file copy is also a mistake. Let me fix both.

@alexott alexott requested a review from Copilot March 5, 2026 19:08

Copilot AI left a comment


Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

@nitinaggarwal-databricks
Contributor Author

Summary of fixes: restored dbsql/serverless_migration_dash, removed duplicate manifest, addressed Copilot (token, XSS, installer job). Ran [brief list of what you tested]. Ready for re-review.

@alexott – could you re-review when you have a moment?

Collaborator


Why is this removed as well?

Collaborator


This file isn't necessary
